Critical Care Explorations
Ovid Technologies (Wolters Kluwer Health)
Preprints posted in the last 90 days, ranked by how well they match Critical Care Explorations's content profile, based on 15 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Landry, T. C.; Kim, Y.
Background. Capillary refill time is a resuscitation target in septic shock [1-4], but bedside measurement is examiner-dependent. An ICU monitor co-records a photoplethysmogram on the pulse oximeter and intermittent noninvasive blood pressure cuff cycles; if the probe and the cuff share a limb, each cycle is an unplanned vascular occlusion test on the distal microvascular bed. Standard practice places the two on opposite limbs. Objective. To measure how often, in MIMIC-IV-WDB v0.1.0, charted cuff cycles show the photoplethysmographic morphology expected of a same-limb cuff and probe, and to characterize the candidate capillary refill-like signal when that morphology is present. Methods. MIMIC-IV-WDB v0.1.0 [5] was linked to the MIMIC-IV clinical database [6]. A pre-registered rule-based detector identified candidate occlusion-reperfusion signatures on the 1-Hz perfusion-index envelope around each charted cuff timestamp. The primary endpoint was the proportion of cuff cycles suitable for analysis that were detector-positive at a 15-second reperfusion threshold, with 95% confidence intervals estimated by resampling patients at a fixed seed. A secondary analysis used a locally hosted multimodal language model (a Gemma-3 derivative on a non-device server) to adjudicate the same signature on perfusion-index plots; no MIMIC-IV-WDB content left the workstation. Results. Of 9,224 charted cuff cycles, 8,909 had a usable pulse-oximeter waveform, and 268 cycles in 15 patients (4.30% of the 6,236 cuff cycles suitable for analysis, 95% CI 2.60 to 6.03) met the primary 15-second threshold. The language model adjudicated the same cycles and called 1,367 of the 8,909 cycles with a usable waveform (15.34%) signature-present, roughly five times the detector's count. Because no laterality ground truth exists, agreement with a single blinded reader served as the comparator rather than accuracy.
The two methods were about equally concordant with the reader: precision was 0.25 (95% CI 0.14 to 0.39) for the detector and 0.24 (95% CI 0.10 to 0.35) for the language model, although reweighting to the full population of cycles with a usable waveform lowered the language model to 0.030 (95% CI 0.009 to 0.053). These estimates are reference-limited: a blinded re-read of a 150-card subsample showed only moderate intra-rater reliability (Cohen κ 0.46 to 0.59) with systematic undercalling on the first pass, and rescoring against the corrected re-read roughly doubled precision for both methods. Conclusions. Opportunistic extraction of capillary refill-like signals from archived ICU pulse oximetry is limited in two distinct ways. First, sensor geometry limits how often the signal is recordable: cuff cycles rarely show the morphology expected of a same-limb cuff and probe pair, consistent with opposite-limb placement, so the bottleneck is geometry rather than signal processing. Second, the modest reliability of morphology adjudication limits how well any single flagged cycle can be confirmed: against a blinded reader the detector is a usable screen but a noisy confirmer, the reference is itself only moderately reliable, and the language model is no more concordant despite flagging many more cycles. The minority of cycles in which the morphology appears contain a candidate signal that may merit prospective study under controlled placement with laterality recorded.
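The abstract above estimates its confidence intervals by resampling patients at a fixed seed. A minimal sketch of that idea follows. It is an illustration, not the authors' pipeline: the function name `cluster_bootstrap_ci`, the percentile interval, the number of resamples, and the toy counts are all assumptions.

```python
import random


def cluster_bootstrap_ci(cycles_by_patient, n_boot=2000, seed=0, alpha=0.05):
    """Percentile CI for the proportion of detector-positive cycles.

    Whole patients (not individual cycles) are resampled with replacement,
    so cycles from the same patient stay together.

    cycles_by_patient: dict mapping patient id -> (n_positive, n_cycles).
    Returns (point_estimate, lower, upper).
    """
    rng = random.Random(seed)  # fixed seed makes the interval reproducible
    patients = list(cycles_by_patient)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(patients) for _ in patients]
        pos = sum(cycles_by_patient[p][0] for p in sample)
        tot = sum(cycles_by_patient[p][1] for p in sample)
        estimates.append(pos / tot)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    pos = sum(v[0] for v in cycles_by_patient.values())
    tot = sum(v[1] for v in cycles_by_patient.values())
    return pos / tot, lower, upper


# Toy counts, hypothetical and NOT taken from the study.
toy = {"p1": (3, 40), "p2": (0, 55), "p3": (1, 30), "p4": (0, 25), "p5": (2, 50)}
point, lower, upper = cluster_bootstrap_ci(toy)
print(f"proportion={point:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

Resampling at the patient level rather than the cycle level matters here because the positive cycles are concentrated in a few patients, so treating cycles as independent would likely understate the interval width.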
Basilakis, A.; Duenser, M. W.
Background: The Therapeutic Distance framework (Paper 1) achieved AUC 0.61 for orbit-based mortality prediction in 11,627 sepsis patients. We hypothesised that incorporating state-dependent parameter relevance would substantially improve prediction. Methods: We extended the framework to 84,176 ICU patients from MIMIC-IV v3.1 across 16 clinical syndromes. Validation included full-population leave-one-out (n=59,362), head-to-head comparison against SAPS-II and logistic regression on 34,467 matched patients with bootstrap confidence intervals, temporal validation, outcome permutation, sensitivity analysis, and calibration assessment. Results: Full-population leave-one-out achieved AUC 0.832 (n=59,362). On 34,467 matched patients, Therapeutic Distance (AUC 0.841) significantly outperformed both SAPS-II (0.786; delta=+0.055, 95% CI +0.048 to +0.061, p<0.001) and logistic regression (0.788). Temporal validation showed stable performance (delta=-0.006). Outcome permutation confirmed genuine signal (AUC 0.859 to 0.498 with shuffled mortality). Sensitivity analysis demonstrated near-zero variation (delta 0.0006-0.003). The framework performed well for 8 of 16 syndromes (AUC >0.70) and failed for DKA and post-cardiac surgery (AUC <0.40). Conclusions: Therapeutic Distance provides therapy-specific risk stratification that exceeds both established severity scores and standard machine learning while remaining robust to hyperparameter choices, temporal drift, and outcome permutation.
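The outcome-permutation check described above (discrimination collapsing to about 0.5 once mortality labels are shuffled) can be illustrated with a short sketch. The pairwise AUC, the helper names `auc` and `permuted_auc`, and the toy scores are assumptions made for illustration; they are not the framework's code or data.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permuted_auc(scores, labels, n_perm=200, seed=0):
    """Mean AUC after shuffling the outcome labels n_perm times."""
    rng = random.Random(seed)
    shuffled = list(labels)
    total = 0.0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        total += auc(scores, shuffled)
    return total / n_perm


# Hypothetical risk scores that separate the two outcome groups perfectly.
scores = [i / 20 for i in range(20)]
labels = [0] * 10 + [1] * 10

print("observed AUC:", auc(scores, labels))
print("mean AUC with shuffled outcomes:", round(permuted_auc(scores, labels), 3))
```

If a model's apparent discrimination survived label shuffling, that would point to leakage or an evaluation artifact rather than real signal, which is what the permutation step is meant to rule out.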
Brandao Raskin, M.; Karhu-Leperd, I.; Harris, C. W.; Pirrachio, R.; Lascarrou, J. B.; Stevens, R. D.
Objectives: To determine whether heterogeneous treatment effects (HTE) explain the inconclusive results of targeted temperature management (TTM) trials after cardiac arrest, using causal machine learning across four datasets. Design: Secondary analysis of one multicenter RCT and three observational ICU cohorts using S-learner and forest-based R-learner models to estimate conditional average treatment effects (CATE). Setting: Twenty-six French ICUs (HYPERION), approximately 200 U.S. ICUs (eICU-CRD), Johns Hopkins Hospital (PMAP), and Beth Israel Deaconess Medical Center (MIMIC-IV). Patients: Adults (≥18 years) with cardiac arrest; 4,507 patients across the four datasets, of whom 1,814 (40.2%) received TTM. Interventions: TTM as administered clinically or per HYPERION protocol. Ascertainment: randomization (HYPERION), treatment documentation (eICU-CRD), sustained hypothermia <36°C for >12 hours (PMAP), or documented cooling device use ≥12 hours (MIMIC-IV). Measurements and Main Results: The primary outcome was hospital mortality; the secondary outcome was favorable neurologic function (Cerebral Performance Category 1-2 at 90 days for HYPERION; last motor Glasgow Coma Scale = 6 for observational cohorts). Three S-learner models (XGBoost, neural network, Bayesian Additive Regression Trees) and one forest-based R-learner (CausalForestDML) estimated CATE. HTE was assessed by likelihood-ratio tests for CATE × treatment interaction, CausalForestDML 95% confidence intervals, Group Average Treatment Effects (GATES) across CATE quintiles, and SHAP feature importance. S-learner discrimination was adequate (AUROC 0.72-0.82). No model showed a significant CATE × TTM interaction in any dataset (all p > 0.05). Individual CATE confidence intervals uniformly crossed zero, and GATES showed no monotonic gradient of benefit across quintiles in any dataset.
Conclusions: Across four diverse datasets and multiple causal machine-learning approaches, we found no evidence of heterogeneous treatment effects for TTM after cardiac arrest. The inconclusive findings of TTM trials are unlikely explained by differential effects in identifiable subgroups defined by routinely available clinical features. KEY POINTS. Question: Do identifiable patient subgroups derive differential benefit from targeted temperature management (TTM) after cardiac arrest? Findings: In a causal machine-learning analysis of 4,507 patients across one randomized trial and three observational ICU cohorts, no model detected significant heterogeneous TTM effects on mortality or neurologic outcome. Meaning: Conflicting TTM trial results are unlikely explained by differential effects in identifiable subgroups, weakening the rationale for personalized TTM strategies based on routinely available clinical features.
Brown, R.-A.; Bonavia, A. S.
Background: Immune dysfunction in sepsis and critical illness is biologically heterogeneous, yet available stratification frameworks leave many patients unclassified. We hypothesized that ex vivo cytokine-induction responses would define a continuous axis of functional immune responsiveness and identify a low-response state enriched in sepsis. Methods: In this prospective observational study, 39 critically ill adults enrolled within 48 hours of ICU admission and 6 healthy controls underwent standardized whole-blood stimulation with lipopolysaccharide, anti-CD3/anti-CD28 antibodies, and PMA/ionomycin, with selected wells additionally supplemented with interleukin-7 or granulocyte-macrophage colony-stimulating factor. Interleukin-6, tumor necrosis factor, and interferon-gamma responses were quantified and referenced to subject-specific unstimulated baselines. A patient-anchored primary feature matrix was used to derive a continuous immune axis by principal component analysis, and a cross-validated 5-feature MiniResponder score was developed as a portable summary measure. Results: Among critically ill patients, induced cytokine responses organized along a dominant continuous axis of functional immune responsiveness; the first principal component explained 53.3% of between-patient variance. MiniResponder captured this axis and showed a lower-shifted distribution in sepsis. Using a control-referenced threshold defined by the 10th percentile of the healthy-control distribution, 19 of 39 patients (48.7%) were classified as low-response, including 15 of 21 patients with sepsis (71.4%) and 4 of 18 critically ill patients without sepsis (22.2%) (odds ratio 8.75, Fisher exact P=0.004). In exploratory analyses, lower MiniResponder scores were associated with greater unadjusted improvement in Sequential Organ Failure Assessment score from day 1 to days 3-9 (rho=-0.33; P=0.046), but this association attenuated after adjustment for baseline SOFA score (beta=-0.10; 95% CI -0.36 to 0.27).
Conclusions: Ex vivo immune profiling identified a continuous patient-anchored axis of functional immune responsiveness in critical illness that can be summarized by a compact 5-feature score. A control-referenced low-response state was enriched in sepsis. This framework may complement existing biomarker-based stratification approaches and support future enrichment strategies in sepsis trials.
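The reported odds ratio can be reproduced from the counts in the abstract above: 15 of 21 sepsis patients and 4 of 18 non-sepsis patients were low-response. The sketch below works through that arithmetic and adds a two-sided Fisher exact test. The function names are hypothetical, and the two-sided rule used (summing all tables no more probable than the observed one) is an assumption about how the published P value was obtained.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one. Integer
    weights are compared so that ties are handled exactly.
    """
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        return comb(row1, k) * comb(row2, col1 - k)

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed)
    return extreme / comb(row1 + row2, col1)


# Counts from the abstract: rows are sepsis / no sepsis,
# columns are low-response / not low-response.
a, b = 15, 21 - 15
c, d = 4, 18 - 4

print("odds ratio:", odds_ratio(a, b, c, d))  # (15 * 14) / (6 * 4) = 8.75
print("Fisher exact P (two-sided):", fisher_two_sided(a, b, c, d))
```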
Navalkar, K. A.; Garnacho-Montero, J.; Canton-Bulnes, M. L.; Garcia-Garmendia, J. L.; Estella, A.; Fernandez-Galilea, A.; Blanco, I.; Estecha-Foncea, M. A.; Gordillo-Resina, M.; Rodriguez-Gomez, J.; Pineda-Capitan, J. J.; Martinez-Fernandez, C.; Escoresca-Ortega, A.; Amaya-Villar, R.; Mora-Ordonez, J.; Gonzalez-Soto, S.; Gutierrez-Pizarraya, A.; Balk, R.; Miller, R. R.; Burke, J. P.; Patel, G.; Parada, J. P.; Schultz, M. J.; Scicluna, B. P.; Blodget, E.; Kumar, S.; Sampson, D.; Yager, T. D.; Davis, R. F.; Cermelli, S.; Brandon, R. B.
Background: Accurate early identification of sepsis remains a major clinical challenge due to its heterogeneous presentation and overlap of clinical signs with the non-infectious systemic inflammatory response syndrome (SIRS). Timely differentiation is crucial for improving patient outcomes, meeting sepsis bundle requirements and reducing inappropriate antimicrobial use. We hypothesized that clinical and laboratory data available within the first 3 hours of patient presentation could be used to identify patients with sepsis to an actionable level of accuracy, in lieu of traditional microbiology results which would not become available until at least 12-24 hours. Data from two independent studies were used to quantify the diagnostic value of demographic, vital, clinical-laboratory, and microbiological data available at three time points for distinguishing retrospectively diagnosed critically ill patients with either sepsis or non-infectious SIRS. A particular focus of this work was an assessment of the utility of SeptiCyte RAPID (Immunexpress Inc., Seattle, Washington, USA) as an aid to sepsis diagnosis, producing actionable data within 1 hour. Methods: Data from two independent study cohorts were analysed. The 510k cohort consisted of 419 adult patients in intensive care (ICU) (MARS, VENUS, and NEPTUNE trials). The Andalusian cohort consisted of 353 ICU patients from the PANGEA study. Logistic regression models, selected by a greedy search algorithm and validated by repeated cross-validation, were used to determine the contributions of different variables to diagnostic accuracy. Diagnostic performance was quantified by area under the receiver operating characteristic curve (AUC). Results: For the 510k cohort, a baseline AUC of 0.69-0.73 was observed using 5-7 vital and demographic variables assessed immediately upon presentation (time T1). 
The addition of clinical-laboratory variables, in particular SeptiCyte RAPID, within 1-3 hours post-presentation (time T2) increased the AUC to 0.83-0.85. Finally, the addition of microbiological data 12-24 hours post-presentation (time T3) further improved the AUC to 0.90-0.91. Similar results were obtained for the Andalusian cohort. AUC values at the three time points were as follows: at time T1, AUC = 0.67 based solely on vital signs and demographics; at time T2, AUC = 0.87 based on vitals + demographics + SeptiCyte RAPID or other clinical laboratory data; at time T3, AUC = 0.93 based on vitals + demographics + SeptiCyte RAPID or other clinical laboratory data + microbiology results. For both cohorts, the most significant variables included temperature, mean arterial pressure, respiratory rate, suspected infection site, SeptiCyte RAPID, procalcitonin, confirmed bacterial infection and positive blood culture confirmation. Conclusions: Accuracy of identification of sepsis increases markedly as demographics and vital signs are supplemented with clinical-laboratory information, and ultimately with microbiological culture results. The fastest improvement occurs within the first 3 hours when laboratory data, and in particular SeptiCyte RAPID results, become available. Integrating rapid host-response testing with SeptiCyte RAPID into time-based diagnostic frameworks may enhance early sepsis recognition, improve antimicrobial stewardship, and support guideline-driven clinical decisions.
Navalkar, K. A.; Wani, P.; Davis, R. F.; Cermelli, S.; Dietrich, M.; von der Forst, M.; Becker, S. L.; Benthien, S.; Baumann, E.; Zeiner, C.; Lepper, P. M.; Garnacho-Montero, J.; Canton-Bulnes, M. L.; Fernandez-Galilea, A.; Luis Garcia-Garmendia, J. L.; Estella, A.; Miller, R. R.; Schultz, M. J.; Rothman, R.; Burke, J.; Patel, G.; Parada, J.; Yager, T. D.; Brandon, R. B.
Overview: SeptiCyte RAPID is an FDA-cleared gene expression test that quantifies host immune response to aid in the diagnosis of sepsis. The test yields a score (the SeptiScore) ranging from 0-15, distributed across four bands (1-4) based on increased likelihood of sepsis. Each band can be characterized by average positive and negative likelihood ratios (LR+, LR- respectively) for the discrimination of sepsis versus the non-infectious systemic inflammatory response syndrome (SIRS). Methods: A retrospective analysis of prospectively collected data from a combined cohort of critically ill patients suspected of sepsis (N=889), recruited across 19 hospitals in the USA and Europe. The analysis quantified the LR+ and LR- parameters as a function of SeptiScore, for discrimination of sepsis vs. SIRS in patients admitted to ICU. Hypotheses: (1) The likelihood ratio (LR) framework provides a clinically useful interpretive approach that complements the previously used SeptiScore banding scheme; (2) Low Band 1 SeptiScores are associated with sufficiently small LR- to support the use of SeptiCyte RAPID as a rule-out test for sepsis; (3) High Band 4 SeptiScores are associated with sufficiently large LR+ to support the use of SeptiCyte RAPID as a rule-in test for sepsis; and (4) SeptiScore-derived LR+ and LR- values can be combined with estimates of pre-test probability (derived from patient characteristics and/or other diagnostic tests) to generate individualized, patient-specific post-test probabilities of sepsis. Results: The SeptiCyte RAPID test demonstrates strong diagnostic performance in distinguishing sepsis from SIRS. The likelihood ratios across different score bands provide clear clinical utility: the median LR+ was 3.26 (range 2.57-4.24) for Band 3, and 6.97 (range 4.35-15.57) for Band 4 providing evidence toward ruling in sepsis at high SeptiScores. 
Conversely, the median LR- was 0.16 (range 0.14-0.20) for Band 2 and 0.085 (range 0.014-0.16) for Band 1, providing evidence toward ruling out sepsis at low SeptiScores. A higher-resolution analysis of SeptiCyte RAPID performance confirmed these trends by evaluating LR+ and LR- at specific values within each band. The sepsis group was further stratified according to whether patients were classified as blood-culture positive (BC+) or blood culture negative (BC-), and the detailed LR+ and LR- analyses were repeated. A monotonic increase in likelihood ratio with increasing SeptiScore was consistently observed, independent of whether sepsis patients were culture-positive, culture-negative, or unstratified with respect to blood culture status. Conclusion: High SeptiScores have correspondingly high LR+ values, and low SeptiScores have correspondingly low LR- values, both of which may have clinical utility. High likelihood ratios for band 4 SeptiScores, which precede traditional microbiology results, may provide clinicians with early confidence of a sepsis diagnosis and microbiology diagnostic stewardship. Low likelihood ratios for band 1 SeptiScores may prompt clinicians to consider an alternate diagnosis to sepsis. Such results, obtained early in the diagnostic workup process, may lead to fewer missed diagnoses and more efficient use of hospital resources.
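Hypothesis (4) above, combining a likelihood ratio with a pre-test probability to obtain a post-test probability, follows the standard odds form of Bayes' rule: post-test odds = pre-test odds × LR. The sketch below applies it to the median Band 4 LR+ (6.97) and the median Band 1 LR- (0.085) reported in the abstract. The 30% pre-test probability and the function name are hypothetical and chosen only for illustration.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability.

    Uses the odds form of Bayes' rule:
        post-test odds = pre-test odds * likelihood ratio
    """
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


PRE_TEST = 0.30          # hypothetical pre-test probability of sepsis
LR_POS_BAND_4 = 6.97     # median LR+ for Band 4, from the abstract
LR_NEG_BAND_1 = 0.085    # median LR- for Band 1, from the abstract

rule_in = post_test_probability(PRE_TEST, LR_POS_BAND_4)
rule_out = post_test_probability(PRE_TEST, LR_NEG_BAND_1)

print(f"Band 4 result: post-test probability about {rule_in:.3f}")   # about 0.749
print(f"Band 1 result: post-test probability about {rule_out:.3f}")  # about 0.035
```

Under this assumed 30% pre-test probability, a Band 4 score moves the estimate to roughly 75% and a Band 1 score moves it to roughly 3.5%; a different pre-test probability would give different post-test values.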
Wiseman, J.; Sibley, S.; Perez-Patrigeon, S.; Mekhaeil, M.; Hanley, M.; Hunt, M.; Boyd, T.; Grant, B.; Boyd, J. G.
Introduction: There is increasing interest in the peripheral administration of vasopressors for two main reasons: (1) to expedite vasopressor initiation in patients with refractory shock and (2) to avoid the potential complications associated with central venous catheter placement. The current evidence on the use of peripheral vasopressor administration is primarily based on single-center observational studies. There are inconsistencies in the administration of peripheral vasopressors, including catheter gauge and location, monitoring practices, vasopressor concentrations, and duration of use. This has made it difficult for institutions to develop best practice guidelines. A randomized controlled trial is needed to address this knowledge gap. Methods and analysis: The Peripheral Use of Low-dose Vasopressors for Safety and Efficacy (PULSE) in the intensive care unit is a prospective, unblinded feasibility study. Eligible patients will be 18 years or older, have no existing central venous catheter or peripherally inserted central catheter and have the presence of shock requiring a minimum vasopressor dose of any of the following: norepinephrine 0.0625 mcg/kg/min, phenylephrine 0.625 mcg/kg/min, and epinephrine 0.0625 mcg/kg/min. Fifty patients will be randomized 1:1 into either the peripheral venous catheter or central venous catheter group. The primary outcome is feasibility, defined as (1) a recruitment rate of 4 participants per month, (2) a data capture rate of ≥90%, and (3) a <50% conversion rate from peripheral to central access. The secondary outcomes include the safety of peripheral vasopressor use, alive and central-line-free days, the number of attempts needed to place a catheter, volume status, in-hospital mortality rate, ICU and hospital length of stay, and patient-centred important outcomes.
Implications: The data collected from this study will inform the design of a definitive randomized controlled trial to assess the safety and efficacy of protocol-driven peripheral vasopressor administration. Ethics and dissemination: This study received approval (6042888) from the Queen's University Health Sciences/Affiliated Teaching Hospitals Research Ethics Boards. Results of this study will be presented at critical care conferences and submitted for publication. Trial registration number: NCT06920173 (https://clinicaltrials.gov/study/NCT06920173).
Coupland, L. A.; Frost, S. A.; Lin, J.; Pham, N.; Suryana, E.; Self, M.; Chia, J.; Lam, T.; Liu, Z.; Jaich, R.; Crispin, P.; Rabbolini, D.; Law, R.; Keragala, C.; Medcalf, R.; Aneman, A.
Rationale: Fibrinolysis resistance in sepsis associates with thrombotic burden, multi-organ failure and death. The degrees and dynamics of resistance that associate with mortality in acute sepsis are unknown, and a simple tool to aid clinician interpretation of fibrinolysis measurements is lacking. Objectives: To establish a point of care grading tool of fibrinolysis resistance that aligns with scoring systems for disease acuity, is substantiated by plasma fibrinolysis markers and enables rapid investigation of the fibrinolysis state at the point of care. Methods: Prospective observational study of 116 adult sepsis/septic shock patients with sequential measurements of fibrinolysis resistance during Intensive Care Unit (ICU) admission using tissue plasminogen activator (tPA) enhanced viscoelastic testing (VET). The clot lysis time (TPA-LT) adjusted for fibrin clot amplitude (TPA-LT/FIBA10, sec/mm) underwent cluster analysis and was evaluated against disease severity scores, standard pathology, clinical outcomes and fibrinolysis markers. Measurements and Main Results: Three clusters of progressively increasing fibrinolysis resistance were identified (Grades 1-3). At admission, Grade 3 associated with the highest disease severity, organ failure, haematological and biochemical perturbations, fibrinolysis marker inhibitory profile and mortality (42% versus 24% and 15% in Grade 2 and Grade 1, respectively) with a 3.9-fold [95% CI 1.4-11] increased hazard ratio for death at 28 days compared to Grade 1. Transitions between grades were frequent over 7 days with a reduced Grade associated with decreased risk of death. Conclusions: Grading of fibrinolysis resistance in sepsis enables rapid identification of patients at greatest mortality risk with any dynamic improvement corresponding to favourable clinical outcomes.
Meza-Fuentes, G.; Delgado, I.; Barbe, M.; Sanchez-Barraza, I.; Filippini, D.; Smit, M. R.; Sinnige, J. S.; Kramer, L.; Smit, J.; Jonkman, A.; Meade, M.; Retamal, M. A.; Lopez, R.; Bos, L. D. J.
Background: Acute respiratory distress syndrome (ARDS) is characterised by substantial physiological heterogeneity, which contributes to highly variable clinical outcomes and therefore to inconsistent responses to ventilatory strategies. We aimed to externally validate physiological ARDS subphenotypes previously identified using routine ventilatory and gas-exchange variables, assess their prognostic relevance across independent cohorts, and examine heterogeneity of treatment effect according to PEEP strategy. Methods: Unsupervised Gaussian Mixture Modelling was used to identify physiological subphenotypes based on ventilatory mechanics and gas-exchange parameters. Labels were subsequently used to train and validate supervised classifiers using XGBoost. Prognostic relevance was assessed across three independent cohorts, including two randomised controlled trials (ALVEOLI and LOVS). Predictive enrichment for PEEP strategy was evaluated using individual patient data from ALVEOLI and LOVS (n = 1,532) using intention-to-treat analyses, applying both one-stage and two-stage fixed-effects IPD meta-analytic approaches to test for interaction between physiological subphenotype and PEEP strategy. Results: Two distinct physiological subphenotypes, termed Efficient and Restrictive, were replicated across independent cohorts. In each cohort, patients classified as Restrictive consistently exhibited higher all-cause 28-day mortality than Efficient patients. When pooled across studies, the Restrictive subphenotype was associated with a significantly increased risk of death (pooled odds ratio 1.75, 95% CI 1.36-2.24), with no evidence of between-study heterogeneity. Predictive analyses showed a statistically significant interaction between physiological subphenotype and PEEP strategy in the one-stage IPD model (p for interaction = 0.037), with concordant findings in the two-stage fixed-effects IPD meta-analysis (interaction OR 1.91, 95% CI 1.00-3.66; I² = 0%).
Higher PEEP was associated with increased mortality in Efficient patients and reduced mortality in Restrictive patients, indicating effect modification by physiological subphenotype. Interpretation: Physiological ARDS subphenotypes derived from routinely collected bedside data provide robust and externally validated prognostic stratification across observational and randomised trial cohorts. The observed interaction with PEEP strategy suggests that underlying physiological profiles may influence treatment response, supporting the concept that physiology-based subphenotyping may be a starting point for personalized medicine and therefore for better ventilatory strategies in future clinical trials.
Basilakis, A.
Background: Patient matching in intensive care databases yields sample sizes too small for individualised outcome analysis. Current AI systems provide population-level guideline summaries but omit stratification variables that may invert therapy signals at the individual level. Methods: We developed the Therapeutic Distance framework, which computes the z-standardised distance between a patient's clinical parameters and the centroid of MIMIC-IV patients who received each therapy: $d(P,T) = \sum_i w_i(T)\,\left|\frac{L_i - \mu_i(T)}{\sigma_i}\right|$. We hypothesise that patients at the same distance to a therapy (same orbit) have comparable outcomes. Six validation experiments were performed on 11,627 sepsis patients (SAPS-II 30-80) from MIMIC-IV v3.1. Results: Echo-stratified vasopressin recipients showed mortality of 30.1% (n=146, 95% CI 22.6-37.7%) versus 53.9% without echo (n=2,426, 95% CI 51.9-55.9%). Confidence intervals did not overlap (bootstrap, 1,000 resamples). However, echo-stratified patients had lower general severity (SAPS-II 49.2 vs 53.9) but higher cardiac biomarkers (troponin 1.0 vs 0.51 ng/mL), indicating that the observed difference is compatible with both severity confounding and a possible cardiac-specific vasopressin effect. Leave-one-out prediction with uniform weights achieved AUC 0.61 as a structural baseline. Conclusions: Therapeutic Distance replaces patient matching with orbit matching, substantially increasing usable sample sizes. The echo-vasopressin finding is hypothesis-generating and mechanistically plausible but not causally proven. The framework is intended as a clinical decision support signal under uncertainty, not as a causal inference method.
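The distance in the Methods above is a weighted sum of absolute z-scores between a patient's parameters and a therapy centroid. The sketch below evaluates that sum directly. It is an illustration only: the function name, the parameter names, and every numeric value are hypothetical, and the way the framework actually derives weights, centroids, and standard deviations is not specified in the abstract.

```python
def therapeutic_distance(patient, centroid, sigma, weights):
    """Weighted sum of absolute z-scores between a patient and a therapy centroid.

    patient:  dict parameter -> patient value L_i
    centroid: dict parameter -> mean mu_i(T) among recipients of therapy T
    sigma:    dict parameter -> standard deviation sigma_i
    weights:  dict parameter -> weight w_i(T)
    Only parameters present in `weights` contribute to the distance.
    """
    return sum(
        w * abs((patient[name] - centroid[name]) / sigma[name])
        for name, w in weights.items()
    )


# Hypothetical values for illustration; not taken from MIMIC-IV.
patient = {"lactate": 4.0, "map": 60.0}
centroid = {"lactate": 2.0, "map": 70.0}
sigma = {"lactate": 1.0, "map": 10.0}
weights = {"lactate": 1.0, "map": 0.5}

# lactate: 1.0 * |(4 - 2) / 1| = 2.0; map: 0.5 * |(60 - 70) / 10| = 0.5
print(therapeutic_distance(patient, centroid, sigma, weights))  # 2.5
```

With uniform weights every parameter contributes its absolute z-score equally, which corresponds to the structural baseline mentioned in the Results.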
Berg, N. K.; Kerchberger, V. E.; Pershad, Y.; Corty, R. W.; Bick, A. G.; Ware, L. B.
Rationale: Sepsis is a life-threatening syndrome causing significant morbidity and mortality, especially in the aging population. Clonal hematopoiesis of indeterminate potential (CHIP) is an age-related condition of clonal expansion of hematopoietic stem cells harboring somatic mutations associated with increased incidence of chronic illness and all-cause mortality. Objective: To evaluate the association of pre-illness CHIP with mortality and morbidity in patients admitted to the ICU with sepsis. Methods: We performed a retrospective study using a de-identified electronic health record linked with a DNA biorepository. We identified adult patients with sepsis who had DNA collected prior to ICU admission. We tested the association between CHIP status, determined from whole-genome sequencing, and ICU mortality, organ support-free days, and long-term survival, adjusting for age, sex, race and Sequential Organ Failure Assessment (SOFA) score on ICU admission. Measurements and Main Results: Pre-illness CHIP was associated with increased sepsis mortality (OR = 1.54, 95% CI 1.13 to 2.07, P = 0.005) and fewer days alive and free of organ support (-1.7 days, 95% CI -3.2 to -0.2, P = 0.028) after adjusting for age, sex, race, and SOFA score. In sepsis survivors, CHIP was also associated with increased long-term mortality after discharge (HR 1.40, 95% CI 1.01 to 1.93, P = 0.041). Conclusions: Pre-illness CHIP was independently associated with increased mortality and morbidity in critically ill adults with sepsis. These findings suggest that CHIP is a risk factor for sepsis severity. Elucidating the mechanism underlying this association could uncover new therapeutic interventions for sepsis.
Sines, B. J.; Hagan, R. S.; Jiang, X.; Pavlechko, E.; McClain, S.; Hunt, X.; Florou-Moreno, J.; Acquardo, J.; Risa, G.; Valsaraj, V.; Schisler, J. C.; Wolfgang, M. C.
Objective: To develop a workflow that transforms electronic health record data into machine learning-ready features for molecular endotype assignment and to evaluate whether clinician-informed feature engineering improves model performance and interpretability. Materials and Methods: We developed parallel clinician-informed and clinician-agnostic feature engineering pipelines to prepare raw EHR data from mechanically ventilated patients with respiratory failure. Molecular endotype labels derived from paired deep lung and blood profiling of subjects with acute lung injury were used to train candidate machine learning classifiers. Champion models from each pipeline were compared on predefined performance metrics. Results: Bayesian network classifiers were the top-performing models in both pipelines. The clinician-informed pipeline generated fewer features than the clinician-agnostic pipeline (645 vs 1,127) and produced a lower misclassification rate in the final Bayesian network model (0.047 vs 0.14). In an independent cohort of subjects with acute lung injury, the clinician-informed model better distinguished corticosteroid-responsive from non-responsive subgroups. Discussion: Clinical context improved feature engineering efficiency, model interpretability, and classification performance. These findings support the integration of domain expertise into machine learning workflows intended for critical care implementation. Conclusions: Clinician-informed feature engineering can simplify machine learning models while improving performance and preserving clinical relevance. AI tools developed for healthcare should incorporate subject matter expertise early in the feature engineering and analytic workflow.
Sines, B.; Hagan, R.; Jiang, X.; Pavlechko, E.; McClain, S.; Hunt, X.; Florou-Moreno, J.; Acquadro, J.; Risa, G.; Valsaraj, V.; Schisler, J.; Wolfgang, M. C.
ABSTRACT Background: Corticosteroids reduce mortality in severe COVID-19 requiring oxygen or invasive mechanical ventilation, yet emerging data suggest that SARS-CoV-2-associated acute lung injury is biologically heterogeneous and that treatment response may vary across molecularly defined disease states. Lung-derived molecular endotypes of severe COVID-19-associated acute lung injury have been described, but direct molecular profiling is not routinely available at the bedside. We evaluated whether a clinical predictor of previously defined lung molecular endotype identifies heterogeneity in corticosteroid treatment effect among mechanically ventilated patients with COVID-19. Methods: We utilized a single-center cohort of 5,000 patients with COVID-19 treated at the University of North Carolina Hospital between January 1, 2020, and December 31, 2022, to emulate a target trial assessing the effect of corticosteroid receipt on mortality, length of stay, and incident organ support. Confounding was addressed through inverse probability of treatment weighting (IPTW). Outcomes for severely ill patients requiring mechanical ventilation were compared to the RECOVERY trial results, with subsequent moderation analysis and stratified analysis by clinically predicted lung molecular endotype and vaccination status. The primary outcome was 28-day mortality. Secondary Outcomes were time to discharge alive and progression to additional organ support. Results: This emulated target trial showed a directionally favorable but non-statistically significant association between corticosteroid treatment and reduced 28-day mortality in patients requiring mechanical ventilation for SARS-CoV-2 infection. A clinical predictor of lung molecular endotype moderated the effect of corticosteroids on 28-day mortality (p-value for interaction 0.038) and identified distinct predicted endotype-specific treatment effect. 
Corticosteroid treatment was associated with lower 28-day mortality in the predicted Hyper-Inflammatory endotype (OR 0.62, 95% CI 0.39, 0.99) but not in the predicted Metabolic Dysregulation endotype (OR 1.15, 95% CI 0.82, 1.61). We did not detect significant effect modification by vaccination status (p-value for interaction 0.65), although inference was limited by the small vaccinated subgroup (28-day mortality OR 0.78, 95% CI 0.37, 1.65 in vaccinated vs 0.94, 95% CI 0.70, 1.26 in unvaccinated). Conclusions: In this target trial emulation of mechanically ventilated patients with severe COVID-19, corticosteroid treatment showed a directionally favorable but statistically non-significant association with reduced 28-day mortality in the overall cohort. However, a clinical predictor of lung molecular endotype identified significant heterogeneity in treatment effect, with benefit concentrated in the predicted Hyper-Inflammatory endotype and no apparent benefit in the predicted Metabolic Dysregulation endotype. These findings support prospective validation of clinically deployable endotype-guided corticosteroid treatment strategies in acute lung injury and ARDS.
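The abstract above names inverse probability of treatment weighting but does not describe the weighting scheme. As a rough illustration only, the sketch below shows the textbook unstabilized form, assuming each patient's propensity score (the estimated probability of receiving corticosteroids) is already available. The function names and the four-patient toy data are hypothetical and are not taken from the study.

```python
def iptw_weights(treated, propensity):
    """Unstabilized IPTW: 1/p for treated patients, 1/(1-p) for untreated."""
    weights = []
    for t, p in zip(treated, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / p if t else 1.0 / (1.0 - p))
    return weights


def weighted_risk(outcome, weights):
    """Weighted proportion of patients with the outcome (e.g. 28-day death)."""
    return sum(o * w for o, w in zip(outcome, weights)) / sum(weights)


# Toy data invented for illustration; not study data.
treated = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]
weights = iptw_weights(treated, propensity)
```

Patients whose treatment was unlikely given their covariates receive larger weights, which is how the weighted sample approximates a population in which treatment is independent of the measured confounders.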
Haque, F.; Hasan, M.
Purpose: Polypharmacy is highly prevalent among critically ill patients, yet its independent impact on intensive care unit (ICU) outcomes in sepsis remains largely unexplored. We aimed to evaluate whether pre-admission polypharmacy independently predicts ICU mortality and provides incremental prognostic value using the medication reconciliation module of the MIMIC-IV-ED linked database. Materials and Methods: We conducted a retrospective cohort study of 3,347 adults admitted to the ICU who met Sepsis-3 criteria. Pre-admission polypharmacy was categorized as none (0-4), standard (5-9), or high (>=10 medications). Multivariable logistic regression, propensity score matching, and reclassification analyses (NRI/IDI) were performed. The primary outcome was in-hospital ICU mortality. Results: High polypharmacy was present in 58.9% of patients. Crude ICU mortality increased sequentially: 18.5% (none), 26.0% (standard), and 27.5% (high; p < 0.001). After multivariable adjustment, high polypharmacy independently predicted in-hospital ICU mortality (aOR 1.45, 95% CI 1.10-1.91) and 28-day mortality (aOR 1.47). Drug-class analysis identified statins as significantly protective (aOR 0.56), whereas RAS blockers combined with diuretics increased acute kidney injury risk (aOR 1.49). Propensity matching confirmed the primary mortality association (matched aOR 1.28). Conclusions: By utilizing the ED medication reconciliation table, this study proves that high polypharmacy represents a distinct 'pharmacologic frailty', independent of acute severity. Available instantly at triage, this zero-latency metric provides significant early prognostic value (SOFA NRI = 0.24) and identifies actionable high-risk interactions (e.g., RAS blockers plus diuretics) for immediate, targeted pharmacist-led intervention upon ICU admission.
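The three exposure categories in the abstract above are defined purely by medication count, so they can be written as a small lookup. The sketch below encodes the stated cut-points (0-4, 5-9, and 10 or more). The function name is hypothetical, and how the study counted medications (for example, whether duplicates or as-needed drugs were included) is not stated and is not modeled here.

```python
def polypharmacy_category(n_medications: int) -> str:
    """Map a pre-admission medication count to the abstract's three categories."""
    if n_medications < 0:
        raise ValueError("medication count cannot be negative")
    if n_medications <= 4:
        return "none"      # 0-4 medications
    if n_medications <= 9:
        return "standard"  # 5-9 medications
    return "high"          # 10 or more medications


# Example: categorize a few hypothetical counts.
categories = [polypharmacy_category(n) for n in (0, 4, 5, 9, 10, 23)]
```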
Kuriyama, A.; Heels-Ansdell, D.; Fernando, S. M.; Adhikari, N. K.; Lamontagne, F.; Teja, B.; Lewis, K. A.; Rochwerg, B.; Carayannopoulos, K. L.; Vazquez-Grande, G.; McIntyre, L.; Honarmand, K.; Chaudhuri, D.; Krag, M.; Zytaruk, N.; Cook, D. J.; Canadian Critical Care Trials Group
Background: Sepsis is a recognized risk factor for upper gastrointestinal bleeding, yet sepsis-specific randomized evidence informing stress ulcer prophylaxis remains limited. Objective: To describe the rationale, methods, and statistical analysis plan for a post hoc subgroup analysis evaluating pantoprazole versus placebo in invasively ventilated critically ill adults with septic shock enrolled in the REVISE trial (NCT03374800). Methods: This study will be a post hoc extended subgroup analysis of the international, blinded, randomized REVISE trial, which enrolled 4,821 mechanically ventilated adults in 68 ICUs across 8 countries. Patients were randomized to intravenous pantoprazole 40 mg once daily or placebo during invasive mechanical ventilation. Septic shock will be defined as receipt of vasopressors or inotropes at baseline together with an admitting diagnosis of infection according to APACHE III diagnostic categories. Results: The primary efficacy outcome will be clinically important upper gastrointestinal bleeding in the ICU within 90 days after randomization, and the primary safety outcome will be all-cause mortality within 90 days. Additional trial outcomes will include patient-important upper gastrointestinal bleeding, ventilator-associated pneumonia, Clostridioides difficile infection during hospitalization, new renal replacement therapy, mortality in the ICU and hospital, and duration of ICU and hospital stay. Analyses will be adjusted for prehospital acid suppression; the mortality analyses will be additionally adjusted for APACHE II score. Conclusion: This protocol and statistical analysis plan describes an evaluation of the efficacy and safety of pantoprazole in patients with septic shock within a large randomized trial dataset.
Arshad, A.; Carey, K. A.; Daniels, L. A.; Jani, P.; Gilbert, E.; Sanchez-Pinto, L. N.; Mayampurath, A.
Objective: Readmissions to the PICU are associated with increased morbidity and mortality. A prediction model that can identify children at risk of readmission at the time of transfer can allow providers to intervene and potentially improve patient outcomes. The objective of this study was to derive and validate machine learning models to predict PICU readmission at the time of transfer. Design: Retrospective observational cohort study. Setting: Three quaternary care PICUs in the city of Chicago. Patients: All children admitted to the PICU between 2012 and 2019. Measurements: The primary outcome was unplanned readmission to the PICU within 48 hours of transfer to the inpatient ward. Predictor variables included vital signs, patient characteristics, and laboratory results. We developed and externally validated four models to predict PICU readmission: logistic regression, elastic net, random forest, and XGBoost. Main Results: This study included 35,601 patients, with readmission rates ranging from 2.2% to 3.7% by site. Model performance during internal validation was consistent across the three sites, with area under the receiver operating characteristic curve (AUC) values between 0.70 and 0.73 and no difference across the four models. Model performance decreased significantly during external validation (AUCs of 0.60-0.69). The variables most important to the prediction differed at each site. Conclusion: Machine learning models for predicting readmissions to the PICU have limited generalizability. Locally derived models demonstrated modest performance in our study and could potentially inform provider decision-making if prospectively validated. Externally developed models are unlikely to perform well at predicting PICU readmissions.
Landry, T. C.; Kim, Y.
Background. Capillary refill time, an examiner-dependent bedside test of distal microvascular perfusion, has become a resuscitation target in septic shock [1-4], motivating a continuous surrogate computed from the photoplethysmogram (PPG, the optical waveform the pulse oximeter on every ICU patient already records) [5-8]. Objective. We attempted three PPG-derived candidate measures on the MIMIC-IV Waveform Database (MIMIC-IV-WDB v0.1.0) and asked, by inspecting randomly drawn examples, whether each captured its intended physiology before any downstream modeling. Methods. MIMIC-IV-WDB v0.1.0 [9] was linked to MIMIC-IV [10]. The signals were a cuff-anchored perfusion-index recovery (reactive hyperemia when the cuff shares an arm with the probe), a slow Mayer-wave-band power ratio of the perfusion index (sympathetic vasomotor tone), and a per-beat diastolic exponential decay time constant (a refill-like recovery time). For each signal we drew 10 random examples at a fixed seed and checked them against a checklist fixed in advance. Each was read by the author and, separately, by MedGemma 1.5, a multimodal medical language model run locally. A synthetic test with a known time constant checked the third signal. Results. The cuff-anchored signal showed the expected occlusion-reperfusion shape on 268 of 6,236 evaluable cuff cycles (4.30%) in 15 of 19 patients, consistent with opposite-limb placement of the probe and cuff. The slow-band ratio returned a stable cohort value, but a clear, stationary peak appeared in only 4 of 10 random windows. The per-beat fit met its goodness-of-fit threshold in 10 of 10 beats, yet a cardiac-frequency heuristic flagged a possible fit on the heart-rate oscillation in 7 of 10, and in 5 of 17 patients the time constant lay where an exponential is indistinguishable from a straight line. A 0.5 Hz high-pass pre-filter implanted its own approximately 318 ms time constant regardless of truth. 
The language model tracked the human on clear positives but reported the pattern present on every call it returned, never absent. Conclusions. Two of the three candidate signals did not reflect their intended physiology in most examples, and the third was constrained by sensor placement. Inspecting a few random raw inputs against a checklist written in advance is an inexpensive upstream check before downstream inference on PPG-derived microvascular signals.
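The figure of approximately 318 ms in the abstract above matches what a single-pole (first-order) high-pass filter with a 0.5 Hz cutoff would contribute, since such a filter has time constant tau = 1 / (2 * pi * f_c). The abstract does not state the filter order or design, so the first-order assumption in the sketch below is ours, and the function name is hypothetical.

```python
import math


def first_order_time_constant_ms(cutoff_hz: float) -> float:
    """Time constant (ms) of a single-pole filter with the given -3 dB cutoff."""
    if cutoff_hz <= 0:
        raise ValueError("cutoff frequency must be positive")
    return 1000.0 / (2.0 * math.pi * cutoff_hz)


# Assumed first-order behaviour at the 0.5 Hz cutoff named in the abstract.
tau_ms = first_order_time_constant_ms(0.5)
print(f"{tau_ms:.1f} ms")
```

Under that assumption, any exponential fitted to the filtered diastolic segment can recover the filter's own decay rather than a physiological one, which is the failure the abstract describes.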
Caraballo, C.; Victoria-Castro, A. M.; Rali, A. S.; Hall, E. J.; Safiriyu, I.; Katz, J. N.; Gage, A.; Notarianni, A. P.; Dudzinski, D. M.; Alviar, C. L.; Tavazzi, G.; Miller, P. E.
Background: The importance of lactate trajectory during the first day of cardiogenic shock is increasingly recognized. We aimed to assess the association between admission-day lactate trajectory and in-hospital mortality, and to identify same-day interventions predictive of lactate clearance. Methods: We analyzed adult patients admitted with cardiogenic shock between October 2015 and June 2023, using the Vizient® Clinical Data Base. Early lactate clearance was defined as lactate <2.5 mmol/L by the end of the admission day. We used multivariable logistic regression to assess the association between lactate change and in-hospital mortality, and to identify interventions associated with lactate clearance. Results: Among 40,434 patients with cardiogenic shock, 30.1% achieved same-day lactate normalization, which was associated with lower in-hospital mortality (aOR 0.51; 95% CI 0.48-0.54). Lactate change showed the greatest prognostic importance, with observed mortality exceeding 80% among those with lactate increase >5 mmol/L regardless of baseline values. After adjustment, lactate change showed a positive exponential relationship with mortality, with aORs ranging from 0.25 (95% CI 0.23-0.27) for a -10 mmol/L change to 3.99 (95% CI 3.58-4.40) for a +10 mmol/L change. The intervention most strongly associated with early lactate clearance was pulmonary artery catheter use (PAC; aOR 1.28 [95% CI 1.19-1.37]). Conclusions: Nearly 1 in 3 patients with cardiogenic shock achieved early lactate clearance, which was associated with lower mortality. The magnitude of lactate change had profound prognostic implications regardless of the initial value. Among day 1 interventions, PAC use had the strongest association with lactate clearance.
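The two extreme odds ratios reported above are close to reciprocals (1 / 3.99 is about 0.25), which is what a model with log-odds linear in lactate change would produce. The sketch below works through that arithmetic. The log-linear form is our assumption based on the phrase "positive exponential relationship"; the abstract does not give the model specification, and the names used are hypothetical.

```python
import math

REPORTED_AOR_PLUS_10 = 3.99  # aOR for a +10 mmol/L change, from the abstract

# Assumed model: log(OR) = beta * delta, so beta is recovered from one point.
beta = math.log(REPORTED_AOR_PLUS_10) / 10.0


def odds_ratio_for_change(delta_mmol_per_l: float) -> float:
    """Odds ratio implied by the assumed log-linear model for a lactate change."""
    return math.exp(beta * delta_mmol_per_l)


per_unit_or = odds_ratio_for_change(1.0)     # implied OR per +1 mmol/L
implied_minus_10 = odds_ratio_for_change(-10.0)
print(f"per mmol/L: {per_unit_or:.3f}, at -10 mmol/L: {implied_minus_10:.2f}")
```

Under this assumption the implied value at -10 mmol/L agrees with the reported 0.25 to two decimals, and each additional 1 mmol/L rise corresponds to roughly 15% higher odds of death.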
Collier, A.
Background Electronic health record documentation patterns may reflect workflow complexity, monitoring intensity, and operational strain in intensive care settings. However, documentation-derived features can be sensitive to local documentation culture, data capture systems, and outcome definitions. Retrospective validation across multiple datasets is therefore needed before these signals are used in workflow intelligence or clinical AI governance tools. Objective To evaluate whether documentation-density and documentation-timing features show reproducible retrospective signal for ICU workflow complexity and long-stay proxy outcomes across de-identified critical care datasets, while distinguishing workflow and long-stay associations from unsupported claims about mortality prediction, burden reduction, or deployment readiness. Methods We synthesized retrospective validation results from de-identified ICU and workflow datasets generated through a prespecified documentation-density validation program. Feature families included Documentation Burden Score style features, Shift-End Documentation Rate style features, documentation reliability style metadata, and all-documentation feature sets where available. Outcomes included long ICU length of stay proxies, mortality where available, and workflow proxy endpoints. Models compared baseline feature sets with enhanced models containing documentation-density or workflow features. Performance was summarized using area under the receiver operating characteristic curve, Brier score where reported, delta AUROC, bootstrap confidence intervals where reported, and label-shuffle controls where available. Results The strongest external long-stay proxy evidence came from the NWICU chartevents analysis, which included 28,612 ICU stays, 20,267 stays with chart events, and 9,619,759 chart events. For ICU length of stay greater than the median, baseline AUROC was 0.5252. 
Enhanced AUROC was 0.9512 for Documentation Burden Score features, 0.9214 for Shift-End Documentation Rate features, 0.8470 for documentation reliability style features, and 0.9517 for all documentation features. Corresponding label-shuffle enhanced AUROCs were near random, ranging from 0.4897 to 0.5064. For ICU length of stay greater than the 75th percentile, baseline AUROC was 0.5155. Enhanced AUROC was 0.9433 for Documentation Burden Score features, 0.9194 for Shift-End Documentation Rate features, 0.8118 for documentation reliability style features, and 0.9427 for all documentation features, with label-shuffle enhanced AUROCs from 0.4836 to 0.4999. Additional retrospective support was observed in eICU workflow analyses, HiRID first-24-hour documentation-density analyses, MIMIC-IV HF ICU internal analyses, MIMIC-IV-Note metadata extensions, and nursing-chart or lab density proxy analyses. However, cross-institution discrimination transfer was weak without recalibration, and several analyses remained proxy validations rather than final clinical validations. Conclusions Documentation-density and documentation-timing features show promising retrospective signal for ICU workflow complexity and long-stay proxy outcomes, especially in NWICU chartevents and selected internal dataset-specific analyses. These findings support further preregistered, prospective, silent-mode validation of documentation-derived workflow intelligence. They do not establish prospective clinical performance, mortality reduction, clinician burden reduction, autonomous deterioration prediction, or deployment readiness.
Gjertsen, M.; Yoon, W.; Afshar, M.; Temte, B.; Leding, B.; Halliday, S.; Bradley, K.; Kim, J.; Mitchell, J.; Sanders, A. K.; Croxford, E. L.; Caskey, J.; Churpek, M. M.; Mayampurath, A.; Gao, Y.; Miller, T.; Kruser, J. M.
Importance: Physicians routinely prognosticate to guide care delivery and shared decision making, particularly when caring for patients with critical illnesses. Yet, these physician estimates are prone to inaccuracy and uncertainty. Artificial intelligence, including large language models (LLMs), shows promise in supporting or improving this prognostication. However, the performance of contemporary LLMs in prognosticating for the heterogeneous population of critically ill patients remains poorly understood. Objective: To characterize and compare the performance of LLMs and physicians when predicting 6-month mortality for hospitalized adults who survived critical illness. Design: Embedded mixed methods study with elicitation and comparison of prognostic estimates and reasoning from LLMs and practicing physicians. Setting: The publicly available, deidentified Medical Information Mart for Intensive Care (MIMIC)-IV v2.2 dataset. Participants: We randomly selected 100 hospitalizations of adult survivors of critical illness. Four contemporary LLMs (OpenAI GPT-4o, o3- and o4-mini, and DeepSeek-R1) and 7 physicians provided independent prognostic estimates for each case (1,100 total estimates; 400 LLM and 700 physician). Main Outcomes and Measures: For each case, LLMs and physicians used the hospital discharge summary and demographics to predict 6-month mortality (yes/no) and provide their reasoning (free text). We assessed prognostic performance using accuracy, sensitivity, and specificity, and used inductive, qualitative content analysis to characterize reasoning. Results: Mean physician accuracy for predicting mortality was 70.1% (95% CI 63.7-76.4%), with sensitivity of 59.7% (95% CI 50.6-68.8%) and specificity of 80.6% (95% CI 71.7-88.2%). The top-performing LLM (OpenAI o4-mini) accuracy was 78.0% (95% CI 70.0-86.0%), with sensitivity of 80.0% (95% CI 67.4-90.2%) and specificity of 76.0% (95% CI 63.3-88.0%). 
The difference between mean physician and top-performing LLM accuracy was not statistically significant (p = 0.5). Qualitative analysis revealed similar patterns in LLM and physician expressed reasoning, except that physicians regularly and explicitly reported uncertainty while LLMs did not. Conclusion and Relevance: In this study, LLMs and physicians achieved comparable, moderate performance in predicting 6-month mortality after critical illness, with similar patterns in expressed reasoning. Our findings suggest LLMs could be used to support prognostication in clinical practice but also raise safety concerns due to the lack of LLM uncertainty expression. Key Points. Question: How do large language model (LLM) prognostic accuracy and reasoning compare with those of physicians when predicting 6-month mortality for adult survivors of critical illness? Findings: In this embedded mixed methods study, physicians and large language models had comparable, moderate prognostic accuracy with similar expressed reasoning patterns except that LLMs did not explicitly express uncertainty. Meaning: Large language models may be able to support physician prognostication, although the inability of LLMs to express uncertainty poses an important safety consideration.
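Accuracy, sensitivity, and specificity are linked through the outcome prevalence: accuracy = sensitivity × prevalence + specificity × (1 − prevalence). The abstract above does not report how many of the 100 cases died within 6 months, but the identity can be inverted to see what prevalence the reported figures imply. The sketch below does this; the function names are hypothetical, and the result is an inference from rounded summary values, not a reported number.

```python
def accuracy_from(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Overall accuracy implied by sensitivity, specificity, and prevalence."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


def implied_prevalence(accuracy: float, sensitivity: float, specificity: float) -> float:
    """Invert the identity above; undefined when sensitivity equals specificity."""
    if sensitivity == specificity:
        raise ValueError("prevalence is not identifiable when sens == spec")
    return (accuracy - specificity) / (sensitivity - specificity)


# Reported figures from the abstract, expressed as proportions.
llm_prev = implied_prevalence(0.780, 0.800, 0.760)        # top-performing LLM
physician_prev = implied_prevalence(0.701, 0.597, 0.806)  # mean physician
print(f"LLM: {llm_prev:.2f}, physicians: {physician_prev:.2f}")
```

Both sets of figures point to a prevalence near one half, which would be consistent with a roughly balanced sample of deaths and survivors, although the abstract does not say so.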